Subsampling vs Bootstrap
Dimitris N. Politis, Joseph P. Romano, Michael Wolf


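The object of interest is the root

$$R_n(X_n, \theta(P)) = \tau_n\bigl(\hat\theta_n - \theta(P)\bigr).$$

Example: $\hat\theta_n = \bar X_n$, $\tau_n = \sqrt{n}$, $\theta = EX = \mu(P)$; or $\hat\theta_n = \min_i X_i$, $\tau_n = n$, $\theta(P) = \sup\{x : F(x) \le 0\}$.

Define: $J_n(P)$, the distribution of $\tau_n(\hat\theta_n - \theta(P))$ under $P$. For real $\hat\theta_n$,

$$J_n(x, P) \equiv \mathrm{Prob}_P\bigl\{\tau_n(\hat\theta_n - \theta(P)) \le x\bigr\}.$$

Since $P$ is unknown, $\theta(P)$ is unknown, and $J_n(x, P)$ is also unknown.

The bootstrap estimates $J_n(x, P)$ by $J_n(x, \hat P_n)$, where $\hat P_n$ is a consistent estimate of $P$ in some sense. For example, take $\hat P_n(x) = \frac{1}{n}\sum_{i=1}^n 1(X_i \le x)$, the empirical distribution, for which

$$\sup_x \bigl|\hat P_n(x) - P(x)\bigr| \to 0 \quad \text{a.s.}$$

Similarly, estimate the $\alpha$th quantile of $J_n(x, P)$ by $J_n^{-1}(\alpha, \hat P_n)$, i.e. estimate $J_n^{-1}(x, P)$ by $J_n^{-1}(x, \hat P_n)$.

Usually $J_n(x, \hat P_n)$ can't be explicitly calculated (although in some simple cases it can be), so use a Monte Carlo approximation:

$$J_n(x, \hat P_n) \approx \frac{1}{B}\sum_{i=1}^B 1\bigl(\tau_n(\hat\theta_{n,i} - \hat\theta_n) \le x\bigr), \qquad \hat\theta_{n,i} = \hat\theta(X^*_{1,i}, \ldots, X^*_{n,i}).$$

When the bootstrap works (the meaning of "works"): for each $x$,

$$J_n(x, \hat P_n) - J_n(x, P) \to 0 \ \text{a.s.}, \qquad J_n^{-1}(\alpha, \hat P_n) - J_n^{-1}(\alpha, P) \to 0 \ \text{a.s.}$$

When should the bootstrap work? Need local uniformity in weak convergence.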
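The Monte Carlo approximation of $J_n(x, \hat P_n)$ can be sketched in a few lines. This is only an illustration for the sample mean with $\tau_n = \sqrt{n}$; the function name `bootstrap_cdf`, the choice $B = 2000$, the seeds and the simulated normal data are assumptions made for the example, not part of the notes.

```python
import math
import random

def bootstrap_cdf(data, x, B=2000, seed=0):
    """Monte Carlo estimate of J_n(x, P_hat_n) for the sample mean, tau_n = sqrt(n)."""
    rng = random.Random(seed)
    n = len(data)
    theta_hat = sum(data) / n
    tau_n = math.sqrt(n)
    hits = 0
    for _ in range(B):
        # n draws with replacement, i.e. an iid sample from the empirical distribution
        resample = rng.choices(data, k=n)
        theta_star = sum(resample) / n
        if tau_n * (theta_star - theta_hat) <= x:
            hits += 1
    return hits / B

# Illustrative data only (assumed): 200 standard normal draws.
data_rng = random.Random(1)
sample = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
p_mid = bootstrap_cdf(sample, 0.0)
p_hi = bootstrap_cdf(sample, 10.0)
p_lo = bootstrap_cdf(sample, -10.0)
```

Evaluating the function on a grid of $x$ values traces out the whole bootstrap distribution; a quantile estimate would then be read off by inverting that curve.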

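The argument runs in four steps:

1. Usually $J_n(x, P) \to J(x, P)$.

2. Also usually $\hat P_n \to P$ a.s. in some sense, say $\sup_x |\hat P_n(x) - P(x)| \to 0$ a.s.

3. Suppose that for each sequence $P_n$ such that $P_n \to P$, say $\sup_x |P_n - P| \to 0$, it is also true that $J_n(x, P_n) \to J(x, P)$. Then it must be true that $J_n(x, \hat P_n) \to J(x, P)$ a.s.

4. So it ends up having to show that $J_n(x, P_n) \to J(x, P)$ for $P_n \to P$; use a triangular array formulation.

Case when it works: sample mean with finite variance. It is known that:

1. $\sup_x |\hat F_n(x) - F(x)| \to 0$ a.s.

2. $\theta(\hat F_n) = \frac{1}{n}\sum_{i=1}^n X_i \to \theta(F) = EX$ a.s.

3. $\sigma^2(\hat F_n) = \frac{1}{n}\sum_{i=1}^n (X_i - \bar X)^2 \to \sigma^2(F) = \mathrm{Var}(X)$ a.s.

4. Use Lindeberg-Feller for the triangular array, applied to a deterministic sequence $P_n$ such that (1) $\sup_x |P_n(x) - P(x)| \to 0$; (2) $\theta(P_n) \to \theta(P)$; (3) $\sigma^2(P_n) \to \sigma^2(P)$. It can be shown that $\sqrt{n}\,(\bar X_n - \theta(P_n)) \to_d N(0, \sigma^2)$ under $P_n$.

5. Since $\hat P_n$ satisfies 1, 2, 3 a.s., it follows that $J_n(x, \hat P_n) \to J(x, P)$ a.s.

Therefore local uniformity of weak convergence is satisfied here.

Cases when bootstrap fails:

1. Order statistics: $F \sim U(0, \theta)$, and $X_{(1)}, \ldots, X_{(n)}$ are the order statistics of the sample, so $X_{(n)}$ is the maximum:

$$P\Bigl(n\,\frac{\theta - X_{(n)}}{\theta} > x\Bigr) = P\Bigl(X_{(n)} < \theta - \frac{\theta x}{n}\Bigr) = \Bigl[P\Bigl(X_i < \theta - \frac{\theta x}{n}\Bigr)\Bigr]^n = \Bigl(\frac{\theta - \theta x / n}{\theta}\Bigr)^n = \Bigl(1 - \frac{x}{n}\Bigr)^n \to e^{-x}.$$

The bootstrap version:

$$P\Bigl(n\,\frac{X_{(n)} - X^*_{(n)}}{X_{(n)}} = 0\Bigr) = 1 - \Bigl(1 - \frac{1}{n}\Bigr)^n \to 1 - e^{-1} \approx 0.63.$$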

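2. Degenerate U-statistics: Take $w(x, y) = xy$, so that $\theta(F) = \int\!\!\int w(x, y)\, dF(x)\, dF(y) = \mu(F)^2$, with

$$\hat\theta_n = \theta(\hat F_n) = \frac{1}{n(n-1)}\sum_{i \ne j} X_i X_j, \qquad S(x) = \int xy\, dF(y) = x\,\mu(F).$$

If $\mu(F) \ne 0$ it is known that

$$\sqrt{n}\,(\hat\theta_n - \theta) \to_d N\bigl(0, 4\,\mathrm{Var}(S(X))\bigr) = N\bigl(0, 4(\mu^2 EX^2 - \mu^4)\bigr).$$

The bootstrap works.

But if $\mu(F) = 0$, so that $\theta(F) = 0$:

$$\theta(\hat F_n) = \frac{1}{n(n-1)}\sum_{i \ne j} X_i X_j = \bar X_n^2 - \frac{1}{n}\,\frac{1}{n-1}\sum_i (X_i - \bar X_n)^2 = \bar X_n^2 - \frac{S_n^2}{n},$$

$$n\bigl(\theta(\hat F_n) - \theta(F)\bigr) = n\bar X_n^2 - S_n^2 \to_d N(0, \sigma^2)^2 - \sigma^2.$$

However, the bootstrap version of $n[\theta(\hat F_n) - \theta(F)]$ is

$$n\bigl[\theta(\hat F_n^*) - \theta(\hat F_n)\bigr] = n\Bigl[\bar X_n^{*2} - \frac{S_n^{*2}}{n}\Bigr] - n\Bigl[\bar X_n^2 - \frac{S_n^2}{n}\Bigr] = n\bar X_n^{*2} - S_n^{*2} - n\bar X_n^2 + S_n^2$$

$$\approx n\bigl(\bar X_n^{*2} - \bar X_n^2\bigr) = \bigl[\sqrt{n}\,(\bar X_n^* - \bar X_n)\bigr]^2 + 2\sqrt{n}\,(\bar X_n^* - \bar X_n)\,\sqrt{n}\,\bar X_n \to_d N(0, \sigma^2)^2 + 2N(0, \sigma^2)\,\sqrt{n}\,\bar X_n.$$

The term $\sqrt{n}\,\bar X_n$ does not converge to a constant, so the bootstrap distribution does not settle on the limit $N(0, \sigma^2)^2 - \sigma^2$.

Subsampling: iid case: $Y_i$ is a block of size $b$ from $(X_1, \ldots, X_n)$, $i = 1, \ldots, q$, for $q = \binom{n}{b}$. Let $\hat\theta_{n,b,i} = \hat\theta(Y_i)$ be calculated with the $i$th block of data. Use the empirical distribution of $\tau_b(\hat\theta_{n,b,i} - \hat\theta_n)$ over the $q$ pseudo-estimates to approximate the distribution of $\tau_n(\hat\theta_n - \theta)$. Approximate

$$J_n(x, P) = P\bigl(\tau_n(\hat\theta_n - \theta) \le x\bigr) \quad \text{by} \quad L_{n,b}(x) = \frac{1}{q}\sum_{i=1}^q 1\bigl(\tau_b(\hat\theta_{n,b,i} - \hat\theta_n) \le x\bigr).$$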
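The subsampling distribution $L_{n,b}(x)$ can be sketched as below for the uniform maximum, the case where the bootstrap fails. Enumerating all $q = \binom{n}{b}$ subsets is infeasible, so this sketch averages over a random selection of subsets instead; that shortcut, the function name `subsampling_cdf`, the values $n = 1000$, $b = 30$ and the simulated $U(0, 1)$ data are all assumptions of the example.

```python
import random

def subsampling_cdf(data, x, b, stat, tau, n_subsets=2000, seed=0):
    """Approximate L_{n,b}(x) using n_subsets random size-b subsets of the data."""
    rng = random.Random(seed)
    theta_hat = stat(data)
    hits = 0
    for _ in range(n_subsets):
        # Draw b observations WITHOUT replacement: a sample of size b from the true model
        block = rng.sample(data, b)
        if tau(b) * (stat(block) - theta_hat) <= x:
            hits += 1
    return hits / n_subsets

# Illustrative data only (assumed): U(0, 1) sample, theta_hat = maximum, tau_n = n.
data_rng = random.Random(42)
sample = [data_rng.uniform(0.0, 1.0) for _ in range(1000)]
L_at_zero = subsampling_cdf(sample, 0.0, b=30, stat=max, tau=lambda m: m)
L_at_minus_one = subsampling_cdf(sample, -1.0, b=30, stat=max, tau=lambda m: m)
```

For $U(0, 1)$ data the limit law of $n(X_{(n)} - \theta)$ puts mass $e^{x}$ below $x \le 0$, so `L_at_minus_one` can be compared informally with $e^{-1}$; how close it gets depends on $n$, $b$ and the number of subsets drawn.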

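Claim: If $b \to \infty$, $b/n \to 0$, $\tau_b/\tau_n \to 0$, then as long as $\tau_n(\hat\theta_n - \theta) \to_d$ something,

$$J_n(x, P) - L_{n,b}(x) \to_p 0.$$

Different motivation for Subsampling vs. Bootstrap:

Subsampling: each subset of size $b$ comes from the TRUE model. Since $\tau_n(\hat\theta_n - \theta) \to_d J(x, P)$, as long as $b \to \infty$:

$$\tau_b(\hat\theta_b - \theta) \to_d J(x, P).$$

For large $n$, the distributions of $\tau_n(\hat\theta_n - \theta)$ and $\tau_b(\hat\theta_b - \theta)$ should be close. But since

$$\tau_b(\hat\theta_b - \theta) = \tau_b(\hat\theta_b - \hat\theta_n) + \tau_b(\hat\theta_n - \theta), \qquad \tau_b(\hat\theta_n - \theta) = O_p\Bigl(\frac{\tau_b}{\tau_n}\Bigr) = o_p(1),$$

the distributions of $\tau_b(\hat\theta_b - \theta)$ and $\tau_b(\hat\theta_b - \hat\theta_n)$ should be close. The distribution of $\tau_b(\hat\theta_b - \hat\theta_n)$ is estimated by the empirical distribution over the $q = \binom{n}{b}$ pseudo-estimates.

Bootstrap: Recalculate the statistics from the ESTIMATED model $\hat P_n$. Given that $\hat P_n$ is close to $P$, hopefully $J_n(x, \hat P_n)$ is close to $J_n(x, P)$, or to $J(x, P)$, the limit distribution. But when the bootstrap fails, $\hat P_n \to P$ does not imply $J_n(x, \hat P_n) \to J(x, P)$.

Formal proof of consistency of subsampling:

Assumptions: $\tau_n(\hat\theta_n - \theta) \to_d J(x, P)$, $b \to \infty$, $b/n \to 0$, $\tau_b/\tau_n \to 0$.

Need to show: $L_{n,b}(x) - J(x, P) \to_p 0$.

Since $\tau_b(\hat\theta_n - \theta) \to_p 0$, it is enough to show

$$U_{n,b}(x) = \frac{1}{q}\sum_{i=1}^q 1\bigl(\tau_b(\hat\theta_{n,b,i} - \theta) \le x\bigr) \to_p J(x, P).$$

Decompose

$$U_{n,b}(x) - J(x, P) = \bigl[U_{n,b}(x) - EU_{n,b}(x)\bigr] + \bigl[EU_{n,b}(x) - J(x, P)\bigr].$$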

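It is enough to show $U_{n,b}(x) - EU_{n,b}(x) \to_p 0$ and $EU_{n,b}(x) - J(x, P) \to 0$. But

$$EU_{n,b}(x) - J(x, P) = J_b(x, P) - J(x, P) \to 0.$$

$U_{n,b}(x)$ is a $b$th order U-statistic with kernel function bounded by $(-1, 1)$. Use the Hoeffding exponential-type inequality (Serfling (1980), Thm A, p. 201):

$$P\bigl(U_{n,b}(x) - J_b(x, P) \ge \epsilon\bigr) \le \exp\Bigl(-2\,\frac{n}{b}\,\epsilon^2 \big/ [1 - (-1)]\Bigr) = \exp\Bigl(-\frac{n}{b}\,\epsilon^2\Bigr) \to 0 \quad \text{as } \frac{n}{b} \to \infty.$$

So

$$L_{n,b}(x) - J(x, P) = \bigl[L_{n,b}(x) - U_{n,b}(x)\bigr] + \bigl[U_{n,b}(x) - J_b(x, P)\bigr] + \bigl[J_b(x, P) - J(x, P)\bigr] \to_p 0.$$

Q.E.D.

Time Series: Respect the ordering of the data to preserve correlation.

$$\hat\theta_{n,b,t} = \hat\theta_b(X_t, \ldots, X_{t+b-1}), \qquad q = T - b + 1,$$

$$L_{n,b}(x) = \frac{1}{q}\sum_{t=1}^q 1\bigl(\tau_b(\hat\theta_{n,b,t} - \hat\theta_n) \le x\bigr),$$

where $T = n$ is the length of the series.

Assumption: $\tau_n(\hat\theta_n - \theta) \to_d J(x, P)$, $b \to \infty$, $b/n \to 0$, $\tau_b/\tau_n \to 0$, $\alpha(m) \to 0$, where $\alpha(m)$ denotes the mixing coefficients.

Result: $L_{n,b}(x) - J(x, P) \to_p 0$.

Most difficult part: to show $\tau_n(\hat\theta_n - \theta) \to_d J(x, P)$.

One can treat iid data as a time series, or even use $k = [n/b]$ non-overlapping blocks, but using all $\binom{n}{b}$ subsets is more efficient. For example, if

$$\bar U_n(x) = \frac{1}{k}\sum_{j=1}^k 1\bigl(\tau_b[R_{n,b,j} - \theta(P)] \le x\bigr),$$

then

$$U_{n,b}(x) = E\bigl[\bar U_n(x) \mid \mathbf{X}_n\bigr] = E\bigl[1\bigl(\tau_b[R_{n,b,j} - \theta(P)] \le x\bigr) \mid \mathbf{X}_n\bigr]$$

for $\mathbf{X}_n = (X_{(1)}, \ldots, X_{(n)})$, the order statistics. $U_{n,b}(x)$ is better than $\bar U_n(x)$ since $\mathbf{X}_n$ is a sufficient statistic for iid data.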
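The time series version, with its $q = n - b + 1$ overlapping blocks of consecutive observations, can be sketched as follows. The names `block_subsampling_cdf` and `mean`, the AR(1) data with coefficient 0.5, and the block size $b = 25$ are assumptions chosen for illustration.

```python
import math
import random

def mean(values):
    return sum(values) / len(values)

def block_subsampling_cdf(series, x, b, stat, tau):
    """L_{n,b}(x) over the q = n - b + 1 overlapping blocks of consecutive observations."""
    n = len(series)
    q = n - b + 1
    theta_hat = stat(series)
    hits = 0
    for t in range(q):
        block = series[t:t + b]  # keeps the original ordering, hence the dependence
        if tau(b) * (stat(block) - theta_hat) <= x:
            hits += 1
    return hits / q

# Illustrative data only (assumed): AR(1) series X_t = 0.5 * X_{t-1} + e_t.
rng = random.Random(7)
series = [0.0]
for _ in range(499):
    series.append(0.5 * series[-1] + rng.gauss(0.0, 1.0))

L_mid = block_subsampling_cdf(series, 0.0, b=25, stat=mean, tau=math.sqrt)
```

Unlike the iid case, no random selection of subsets is needed here, because the number of admissible blocks is only of order $n$.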

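Hypothesis Testing: $T_n = \tau_n t_n(X_1, \ldots, X_n)$, with

$$G_n(x, P) = \mathrm{Prob}_P\{T_n \le x\} \to G(x, P) \quad \text{for } P \in \mathbf{P}_0,$$

$$\hat G_{n,b}(x) = \frac{1}{q}\sum_{i=1}^q 1(T_{n,b,i} \le x) = \frac{1}{q}\sum_{i=1}^q 1(\tau_b t_{n,b,i} \le x).$$

As long as $b \to \infty$, $b/n \to 0$, then under $P \in \mathbf{P}_0$:

$$\hat G_{n,b}(x) \to G(x, P).$$

If under $P \in \mathbf{P}_1$, $T_n \to \infty$, then for all $x$, $\hat G_{n,b}(x) \to 0$.

Key difference with confidence intervals: there is no need for $\tau_b/\tau_n \to 0$, because there is no need to estimate $\theta_0$; it is assumed known under the null hypothesis.

Estimating the unknown rate of convergence: Assume that $\tau_n = n^\beta$ for some $\beta > 0$, but $\beta$ is unknown. Estimate $\beta$ using subsampling distributions of different sizes.

Key idea: Compare the shape of the empirical distributions of $\hat\theta_b - \hat\theta_n$ for different values of $b$ to infer the value of $\beta$.

Let $q = \binom{n}{b}$ for iid data, or $q = T - b + 1$ for time series data:

$$L_{n,b}(x \mid \tau_b) \equiv \frac{1}{q}\sum_{a=1}^q 1\bigl(\tau_b(\hat\theta_{n,b,a} - \hat\theta_n) \le x\bigr), \qquad L_{n,b}(x \mid 1) \equiv \frac{1}{q}\sum_{a=1}^q 1\bigl(\hat\theta_{n,b,a} - \hat\theta_n \le x\bigr).$$

This implies

$$L_{n,b}(x \mid \tau_b) = L_{n,b}(\tau_b^{-1} x \mid 1) \equiv t,$$

which is the same as

$$x = L_{n,b}^{-1}(t \mid \tau_b) = \tau_b\,(\tau_b^{-1} x) = \tau_b\, L_{n,b}^{-1}(t \mid 1).$$

Since $L_{n,b}(x \mid \tau_b) \to_p J(x, P)$, if $J(x, P)$ is continuous and increasing, it can be inferred that

$$L_{n,b}^{-1}(t \mid \tau_b) = J^{-1}(t, P) + o_p(1), \quad \text{i.e.} \quad \tau_b\, L_{n,b}^{-1}(t \mid 1) = J^{-1}(t, P) + o_p(1).$$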

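So

$$b^\beta L_{n,b}^{-1}(t \mid 1) = J^{-1}(t, P) + o_p(1).$$

Take logs (assuming $J^{-1}(t, P) > 0$, or $t > J(0, P)$). For different $b_1$ and $b_2$ this becomes

$$\beta \log b_1 + \log L_{n,b_1}^{-1}(t \mid 1) = \log J^{-1}(t, P) + o_p(1),$$

$$\beta \log b_2 + \log L_{n,b_2}^{-1}(t \mid 1) = \log J^{-1}(t, P) + o_p(1).$$

Difference out the fixed effect:

$$\beta(\log b_1 - \log b_2) = \log L_{n,b_2}^{-1}(t \mid 1) - \log L_{n,b_1}^{-1}(t \mid 1) + o_p(1).$$

So estimate $\beta$ by

$$\hat\beta = (\log b_1 - \log b_2)^{-1}\bigl(\log L_{n,b_2}^{-1}(t \mid 1) - \log L_{n,b_1}^{-1}(t \mid 1)\bigr) = \beta + (\log b_1 - \log b_2)^{-1}\, o_p(1).$$

Take $b_1 = n^{\gamma_1}$, $b_2 = n^{\gamma_2}$, $1 > \gamma_1 > \gamma_2 > 0$:

$$\hat\beta - \beta = \bigl((\gamma_1 - \gamma_2)\log n\bigr)^{-1} o_p(1) = o_p\bigl((\log n)^{-1}\bigr).$$

How to know that $t > J(0, P)$? Since

$$L_{n,b}(0 \mid \tau_b) = L_{n,b}(0 \mid 1) = J(0, P) + o_p(1),$$

estimating $J(0, P)$ is not a problem.

Alternatively, take $t_2 \in (0.5, 1)$ and $t_1 \in (0, 0.5)$:

$$b^\beta\bigl(L_{n,b}^{-1}(t_2 \mid 1) - L_{n,b}^{-1}(t_1 \mid 1)\bigr) = J^{-1}(t_2, P) - J^{-1}(t_1, P) + o_p(1),$$

$$\beta \log b + \log\bigl(L_{n,b}^{-1}(t_2 \mid 1) - L_{n,b}^{-1}(t_1 \mid 1)\bigr) = \log\bigl(J^{-1}(t_2, P) - J^{-1}(t_1, P)\bigr) + o_p(1),$$

$$\hat\beta = (\log b_1 - \log b_2)^{-1}\Bigl[\log\bigl(L_{n,b_2}^{-1}(t_2 \mid 1) - L_{n,b_2}^{-1}(t_1 \mid 1)\bigr) - \log\bigl(L_{n,b_1}^{-1}(t_2 \mid 1) - L_{n,b_1}^{-1}(t_1 \mid 1)\bigr)\Bigr].$$

Take $b_1 = n^{\gamma_1}$, $b_2 = n^{\gamma_2}$, $1 > \gamma_1 > \gamma_2 > 0$. As before, $\hat\beta - \beta = o_p\bigl((\log n)^{-1}\bigr)$.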
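The quantile-difference estimator of $\beta$ can be sketched as below. The sketch uses the unscaled subsampling distribution $L_{n,b}(\cdot \mid 1)$ at two block sizes and, as an assumed shortcut, random subsets rather than all $\binom{n}{b}$ of them. The function names, the choices $t_1 = 0.25$, $t_2 = 0.75$, $\gamma_1 = 0.7$, $\gamma_2 = 0.4$ and the simulated normal data (sample mean, for which $\beta = 1/2$) are illustrative assumptions.

```python
import math
import random

def subsample_quantile_spread(data, b, stat, t1, t2, n_subsets, rng):
    """Difference of the t2 and t1 quantiles of L_{n,b}(. | 1), the unscaled roots."""
    theta_hat = stat(data)
    roots = sorted(stat(rng.sample(data, b)) - theta_hat for _ in range(n_subsets))

    def quantile(t):
        # Empirical quantile: smallest root with at least a fraction t of roots at or below it
        return roots[max(0, math.ceil(t * n_subsets) - 1)]

    return quantile(t2) - quantile(t1)

def estimate_rate(data, b1, b2, stat, t1=0.25, t2=0.75, n_subsets=1000, seed=0):
    """beta_hat from the log spreads at block sizes b1 > b2."""
    rng = random.Random(seed)
    spread1 = subsample_quantile_spread(data, b1, stat, t1, t2, n_subsets, rng)
    spread2 = subsample_quantile_spread(data, b2, stat, t1, t2, n_subsets, rng)
    return (math.log(spread2) - math.log(spread1)) / (math.log(b1) - math.log(b2))

def mean(values):
    return sum(values) / len(values)

# Illustrative data only (assumed): iid standard normal, statistic = sample mean.
data_rng = random.Random(3)
n = 2000
sample = [data_rng.gauss(0.0, 1.0) for _ in range(n)]
b1, b2 = round(n ** 0.7), round(n ** 0.4)
beta_hat = estimate_rate(sample, b1, b2, mean)
```

Because the error is only $o_p((\log n)^{-1})$, the estimate should not be expected to sit very close to the true rate at moderate $n$.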

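Two step subsampling: Set $\hat\tau_n = n^{\hat\beta}$ and

$$L_{n,b}(x \mid \hat\tau_b) = \frac{1}{q}\sum_{a=1}^q 1\bigl(\hat\tau_b(\hat\theta_{n,b,a} - \hat\theta_n) \le x\bigr).$$

Can show that

$$\sup_x \bigl|L_{n,b}(x \mid \hat\tau_b) - J(x, P)\bigr| \to_p 0.$$

Problem: imprecise in small samples. E.g. in variance estimation, the best choice of $b$ gives an error rate of $O_p(n^{-1/3})$, but parameter estimates, if the model is true, give an $O_p(n^{-1/2})$ error rate. Bootstrapping pivotal statistics, when applicable, gives an even better than $O_p(n^{-1/2})$ error rate.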